(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property 
Organization 

International Bureau 

(43) International Publication Date 
2 June 2005 (02.06.2005) 





PCT 



(10) International Publication Number 

WO 2005/050621 A2 



(51) International Patent Classification⁷: G10L 15/22 

(21) International Application Number: 

PCT/IB2004/052403 

(22) International Filing Date: 

12 November 2004 (12.11.2004) 

(25) Filing Language: English 

(26) Publication Language: English 

(30) Priority Data: 

03104314.4 21 November 2003 (21.11.2003) EP 

(71) Applicant (for DE only): PHILIPS INTELLECTUAL 
PROPERTY & STANDARDS GMBH [DE/DE]; Steindamm 94, 
20099 Hamburg (DE). 

(71) Applicant (for all designated States except DE, US): 
KONINKLIJKE PHILIPS ELECTRONICS N.V. 
[NL/NL]; Groenewoudseweg 1, NL-5621 BA Eindhoven 
(NL). 

(72) Inventors; and 

(75) Inventors/Applicants (for US only): PETERS, Jochen 
[DE/DE]; c/o Philips Intellectual Property & Standards 
GmbH, Weisshausstr. 2, 52066 Aachen (DE). MATUSOV, 
Evgeny [RU/DE]; c/o Philips Intellectual Property & 
Standards GmbH, Weisshausstr. 2, 52066 Aachen (DE). 
MEYER, Carsten [DE/DE]; c/o Philips Intellectual 
Property & Standards GmbH, Weisshausstr. 2, 52066 
Aachen (DE). KLAKOW, Dietrich [DE/DE]; c/o Philips 
Intellectual Property & Standards GmbH, Weisshausstr. 2, 
52066 Aachen (DE). 

(74) Agents: MEYER, Michael et al.; Philips Intellectual 
Property & Standards GmbH, Weisshausstr. 2, 52066 
Aachen (DE). 

(81) Designated States (unless otherwise indicated, for every 
kind of national protection available): AE, AG, AL, AM, 
AT, AU, AZ, BA, BB, BG, BR, BW, BY, BZ, CA, CH, CN, 

[Continued on next page] 



(54) Title: TOPIC SPECIFIC MODELS FOR TEXT FORMATTING AND SPEECH RECOGNITION 



[Figure: flowchart. Boxes: speech; perform first speech recognition; perform text structuring / select language model for each text section; provide language models; perform second speech recognition or re-scoring of first speech recognition; provide transcribed text. Reference numerals shown: 200, 202, 204, 208, 210.] 

(57) Abstract: The present invention relates to 
a method, a computer system and a computer 
program product for speech recognition and/or 
text formatting by making use of topic specific 
statistical models. A text document, which may 
be obtained from a first speech recognition pass, is 
subjected to segmentation and to an assignment of 
a topic specific model to each obtained section. 
Each model of the set of models provides statistical 
information about language model probabilities and 
about text processing or formatting rules, such as 
the interpretation of commands for punctuation, 
formatting, or text highlighting, or of ambiguous text 
portions requiring specific formatting, as well 
as a specific vocabulary that is characteristic of 
each section of the recognized text. Furthermore, 
other properties of a speech recognition and/or 
formatting system (such as settings for the 
speaking rate) may be encoded in the statistical 
models. The models themselves are generated 
on the basis of annotated training data and/or 
by manual coding. Based on the assignment of 
models to sections of text an improved speech 
recognition and/or text formatting procedure is 
performed. 
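
The procedure summarized in the abstract (first recognition pass, segmentation into sections, assignment of a topic specific model to each section, then a second pass or re-scoring) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the smoothed unigram models, the topic names "history" and "medication", the toy training sentences, the acoustic scores, and the helper names `train_unigram`, `assign_model`, and `rescore` are all hypothetical. Segmentation is assumed to have been done already, so the sections are given as a list.

```python
import math
from collections import Counter

OOV_LOGP = math.log(1e-6)  # assumed penalty for words outside the vocabulary

def train_unigram(texts, vocab, alpha=1.0):
    """Add-alpha smoothed unigram log-probabilities over a fixed vocabulary."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}

def log_likelihood(model, text):
    """Sum of word log-probabilities; unknown words get OOV_LOGP."""
    return sum(model.get(w, OOV_LOGP) for w in text.lower().split())

def assign_model(models, section):
    """Return the name of the topic model that best explains the section."""
    return max(models, key=lambda name: log_likelihood(models[name], section))

def rescore(models, topic, nbest):
    """Pick the hypothesis maximizing acoustic score + topic LM score."""
    best = max(nbest, key=lambda h: h[1] + log_likelihood(models[topic], h[0]))
    return best[0]

# Hypothetical annotated training data, one list of sentences per topic.
training = {
    "history": ["patient reports chest pain since monday",
                "patient has a history of asthma"],
    "medication": ["aspirin one hundred milligrams daily",
                   "continue aspirin and start inhaler daily"],
}
vocab = sorted({w for texts in training.values() for t in texts for w in t.split()})
models = {name: train_unigram(texts, vocab) for name, texts in training.items()}

# Sections of the first-pass transcript (segmentation assumed already done).
sections = ["patient reports pain since monday", "start aspirin daily"]
assigned = [assign_model(models, s) for s in sections]

# Hypothetical N-best list for the second section: (text, acoustic log-score).
nbest = [("start a spirit daily", -9.5), ("start aspirin daily", -10.0)]
final = rescore(models, assigned[1], nbest)

print(assigned)
print(final)
```

With this toy data the first section is assigned to the "history" model and the second to the "medication" model, and re-scoring with the medication model prefers "start aspirin daily" even though its assumed acoustic score is slightly lower. A real system would use richer language models and could also attach formatting rules and vocabulary to each topic model, as the abstract describes.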